*Replication Package for Inflation and Attention Thresholds
*by: Oleg Korenok, David Munro, and Jiayi Chen

*Requires the user-written xthreg package for the panel threshold regressions.
*The threshold command used for the single-country models is built into Stata 15 and later.
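
*Optional check (an addition, not part of the original package): stop early with a
*clear message if xthreg is not on the ado-path. How you install it is left to you
*(for example, start from "search xthreg").
capture which xthreg
if _rc {
    display as error "xthreg not found; install it before running the panel regressions"
    exit 199
}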


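*Set your directory. Dir is a local macro, so run the whole file in one go;
*if later chunks are executed on their own, Dir will be empty.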
local Dir "/Users/......"
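
*Optional (an addition, not part of the original package): graph export fails if the
*target folder does not exist, so create the Figures folder under Dir if needed.
*capture suppresses the error when the folder is already there.
capture mkdir "`Dir'/Figures"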

*===============Google Results=================== 
clear
u "`Dir'/Google_Hits.dta"

*Panel regressions for Table 1:
xthreg TotalHits, rx(CPI) qx(CPI) thnum(1) grid(400) bs(300) trim(0.05)
xthreg TotalHits L.CPI, rx(CPI) qx(CPI) thnum(1) grid(400) bs(300) trim(0.05)
xthreg TotalHits L.CPI L2.CPI, rx(CPI) qx(CPI) thnum(1) grid(400) bs(300) trim(0.05)

*Run simple linear panel regressions to see if threshold model improves fit 
*(compute F-stat using SSRs)
xtreg TotalHits CPI, fe
xtreg TotalHits CPI L.CPI, fe
xtreg TotalHits CPI L.CPI L2.CPI, fe

*Check up to 3 thresholds; only a single threshold is significant
xthreg TotalHits, rx(CPI) qx(CPI) thnum(3) bs(300 300 300) trim(0.05 0.05 0.05)

*Robustness check: do the results hold on pre-pandemic data?
xthreg TotalHits if Date<td(1jan2020), rx(CPI) qx(CPI) thnum(1) bs(300) trim(0.05)

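*Build a within-country time index. Sorting on Date inside each country makes the index
*reproducible (bysort on CountryID alone leaves the within-country order undefined).
bysort CountryID (Date): gen DateID=_n
xtset CountryID DateID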

*US specific regressions in Table 1:

threshold TotalHits if CountryID==41, threshvar(CPI) regionvars(CPI) 
*Versus simple linear model to see if significant (Compare SSRs):
reg TotalHits CPI if CountryID==41
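
*Sketch (an addition, not the authors' code): one way to compare the two SSRs above.
*Assumes the statistic F = N*(SSR_linear - SSR_threshold)/SSR_threshold; the exact
*formula behind the paper's tables is not shown in this file. Because the threshold
*is not identified under the linear null, standard F critical values do not apply.
*The scalar names ssr_lin, ssr_thr, n_obs and F_stat are illustrative.
quietly reg TotalHits CPI if CountryID==41
scalar ssr_lin = e(rss)
scalar n_obs = e(N)
quietly threshold TotalHits if CountryID==41, threshvar(CPI) regionvars(CPI)
scalar ssr_thr = e(ssr)
scalar F_stat = scalar(n_obs)*(scalar(ssr_lin) - scalar(ssr_thr))/scalar(ssr_thr)
display "F-type statistic, US, no lags: " scalar(F_stat)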

*Same with one lag
threshold TotalHits L.CPI if CountryID==41, threshvar(CPI) regionvars(CPI) 
reg TotalHits CPI L.CPI if CountryID==41

*Same with two lags
threshold TotalHits L.CPI L2.CPI if CountryID==41, threshvar(CPI) regionvars(CPI) 
reg TotalHits CPI L.CPI L2.CPI if CountryID==41

*US with inflation uncertainty or unemployment rate (UR) controls (Table A.1)
threshold TotalHits Inf_Uncert_1 if CountryID==41, threshvar(CPI) regionvars(CPI) 
reg TotalHits CPI Inf_Uncert_1 if CountryID==41

threshold TotalHits Inf_Uncert_5 if CountryID==41, threshvar(CPI) regionvars(CPI)
reg TotalHits CPI Inf_Uncert_5 if CountryID==41 

threshold TotalHits UR if CountryID==41, threshvar(CPI) regionvars(CPI) 
reg TotalHits CPI UR if CountryID==41


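*Loop through all countries: estimate the linear and threshold models and generate a graph for each.
*x collects the threshold estimates, p1 the linear-model SSRs, and p2 the threshold-model SSRs.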
matrix drop _all
levelsof CountryID
foreach lev in `r(levels)' {
di(`lev')
reg TotalHits CPI if CountryID==`lev'
matrix z1=e(rss)
matrix p1 = nullmat(p1)\z1
threshold TotalHits if CountryID==`lev', threshvar(CPI) regionvars(CPI)
matrix t=e(thresholds)
matrix z2=e(ssr)
scalar m0 = t[1,2]
matrix x = nullmat(x)\ m0
matrix p2 = nullmat(p2)\z2

twoway (scatter TotalHits CPI if CountryID==`lev') (lfit TotalHits CPI if CountryID==`lev'&CPI<scalar(m0)) (lfit TotalHits CPI if CountryID==`lev'&CPI>=scalar(m0)), graphregion(color(white)) legend(order(1 "Google Search Index" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Google Search Index") xtitle("Inflation")
graph export "`Dir'/Figures/Scatter`lev'.png" , as(png) replace
}
matrix list x
matrix list p1
matrix list p2
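
*Optional (an addition, not part of the original package): view the three column
*vectors side by side. Rows follow the order of CountryID returned by levelsof.
*The matrix name res and the column names are illustrative.
matrix res = x, p1, p2
matrix colnames res = Threshold SSR_linear SSR_threshold
matrix list res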


*Redo the individual country estimates to match the Twitter time frame.

*threshold cannot handle gaps, and some countries have missing Twitter data early in the sample. Drop those early observations so the time frames match the Twitter results.

*Run this chunk only for the like-for-like Twitter comparison. It is not needed for the post-2010 robustness check on the geographic improvement in the Trends index.
drop if CountryID==5 & mdate<tm(2012m1)
drop if CountryID==21 & mdate<tm(2011m4)
drop if CountryID==6 & mdate<tm(2014m4)
drop if CountryID==22 & mdate<tm(2014m4)
drop if CountryID==39 & mdate<tm(2011m7)
drop if CountryID==12 & mdate<tm(2011m4)
drop if CountryID==4 & mdate<tm(2011m4)
drop if CountryID==3 & mdate<tm(2011m4)

*Loop through all countries (observations with DateID>84 only) and generate graphs
matrix drop _all
levelsof CountryID
foreach lev in `r(levels)' {
di(`lev')
reg TotalHits CPI if CountryID==`lev'& DateID>84
matrix z1=e(rss)
matrix p1 = nullmat(p1)\z1
threshold TotalHits if CountryID==`lev' & DateID>84, threshvar(CPI) regionvars(CPI)
matrix t=e(thresholds)
matrix z2=e(ssr)
scalar m0 = t[1,2]
matrix x = nullmat(x)\ m0
matrix p2 = nullmat(p2)\z2

twoway (scatter TotalHits CPI if CountryID==`lev'&DateID>84) (lfit TotalHits CPI if CountryID==`lev'&CPI<scalar(m0) & DateID>84) (lfit TotalHits CPI if CountryID==`lev'&CPI>=scalar(m0) & DateID>84), graphregion(color(white)) legend(order(1 "Google Search Index" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Google Search Index") xtitle("Inflation")
graph export "`Dir'/Figures/ScatterPost2010`lev'.png" , as(png) replace
}
matrix list x
matrix list p1
matrix list p2

*===================Twitter data=======

clear

u "`Dir'/Twitter_Dat.dta"

*threshold cannot handle gaps. Some countries have missing data very early in the sample; drop those observations here:

drop if CountryID==5 & mdate<tm(2012m1)
drop if CountryID==21 & mdate<tm(2011m4)
drop if CountryID==6 & mdate<tm(2014m4)
drop if CountryID==22 & mdate<tm(2014m4)
drop if CountryID==39 & mdate<tm(2011m7)
drop if CountryID==12 & mdate<tm(2011m4)
drop if CountryID==4 & mdate<tm(2011m4)
drop if CountryID==3 & mdate<tm(2011m4)

matrix drop _all
scalar drop _all
levelsof CountryID
foreach lev in `r(levels)' {
di(`lev')
reg Tweets_Norm CPI if CountryID==`lev'
matrix z1=e(rss)
matrix p1 = nullmat(p1)\z1
threshold Tweets_Norm if CountryID==`lev', threshvar(CPI) regionvars(CPI)
matrix t=e(thresholds)
matrix z2=e(ssr)
scalar m = t[1,2]
scalar list m
matrix x = nullmat(x)\ m
matrix p2 = nullmat(p2)\z2

twoway (scatter Tweets_Norm CPI if CountryID==`lev') (lfit Tweets_Norm CPI if CountryID==`lev'&CPI<scalar(m)) (lfit Tweets_Norm CPI if CountryID==`lev'&CPI>=scalar(m)), graphregion(color(white)) legend(order(1 "Twitter Frequency" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Twitter Frequency") xtitle("Inflation")
graph export "`Dir'/Figures/ScatterTwitter`lev'.png" , as(png) replace
}
matrix list x
matrix list p1
matrix list p2


*======================Newspaper Data=============
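*See appendix for details on data construction
*The cutoffs in the lfit conditions below (e.g. 3.77 for the USA) are hard-coded; they are presumably the estimates reported by the preceding threshold command, so update them if the data change.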

clear
u "`Dir'/News_Dat.dta"


*USA
threshold ShareArticles if CountryID==6, threshvar(CPI) regionvars(CPI)
reg ShareArticles CPI if CountryID==6

twoway (scatter ShareArticles CPI if CountryID==6) (lfit ShareArticles CPI if CPI<3.77&CountryID==6) (lfit ShareArticles CPI if CPI>=3.77&CountryID==6), graphregion(color(white)) legend(order(1 "Newspaper Coverage" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Inflation Newspaper Articles") xtitle("Inflation") xsc(r(-2,9)) xlabel(-2(1)9)

graph export "`Dir'/Figures/ScatterNewsUSA.png" , as(png) replace

*Canada
threshold ShareArticles if CountryID==1, threshvar(CPI) regionvars(CPI)
reg ShareArticles CPI if CountryID==1

twoway (scatter ShareArticles CPI if CountryID==1) (lfit ShareArticles CPI if CPI<2.45&CountryID==1) (lfit ShareArticles CPI if CPI>=2.45&CountryID==1), graphregion(color(white)) legend(order(1 "Newspaper Coverage" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Inflation Newspaper Articles") xtitle("Inflation")

graph export "`Dir'/Figures/ScatterNewsCAN.png" , as(png) replace

*Germany

threshold ShareArticles if CountryID==2, threshvar(CPI) regionvars(CPI)
reg ShareArticles CPI if CountryID==2

twoway (scatter ShareArticles CPI if CountryID==2) (lfit ShareArticles CPI if CPI<1.4&CountryID==2) (lfit ShareArticles CPI if CPI>=1.4&CountryID==2), graphregion(color(white)) legend(order(1 "Newspaper Coverage" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Inflation Newspaper Articles") xtitle("Inflation")

graph export "`Dir'/Figures/ScatterNewsDEU.png" , as(png) replace

*UK

threshold ShareArticles if CountryID==3, threshvar(CPI) regionvars(CPI)
reg ShareArticles CPI if CountryID==3

twoway (scatter ShareArticles CPI if CountryID==3) (lfit ShareArticles CPI if CPI<2.3&CountryID==3) (lfit ShareArticles CPI if CPI>=2.3&CountryID==3), graphregion(color(white)) legend(order(1 "Newspaper Coverage" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Inflation Newspaper Articles") xtitle("Inflation")

graph export "`Dir'/Figures/ScatterNewsGBR.png" , as(png) replace

*Japan

threshold ShareArticles if CountryID==4, threshvar(CPI) regionvars(CPI)
reg ShareArticles CPI if CountryID==4

twoway (scatter ShareArticles CPI if CountryID==4) (lfit ShareArticles CPI if CPI<-0.1&CountryID==4) (lfit ShareArticles CPI if CPI>=-0.1&CountryID==4), graphregion(color(white)) legend(order(1 "Newspaper Coverage" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Inflation Newspaper Articles") xtitle("Inflation")

graph export "`Dir'/Figures/ScatterNewsJPN.png" , as(png) replace

*Turkey

threshold ShareArticles if CountryID==5, threshvar(CPI) regionvars(CPI)
reg ShareArticles CPI if CountryID==5

twoway (scatter ShareArticles CPI if CountryID==5) (lfit ShareArticles CPI if CPI<7.14&CountryID==5) (lfit ShareArticles CPI if CPI>=7.14&CountryID==5), graphregion(color(white)) legend(order(1 "Newspaper Coverage" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Inflation Newspaper Articles") xtitle("Inflation")

graph export "`Dir'/Figures/ScatterNewsTUR.png" , as(png) replace


*====== Historical US Newspaper Data===========
clear
u "`Dir'/Hist_US_News_Dat.dta"

twoway line NormalizedArticles mdate, ytitle("Share of Inflation Articles") || line Inflation mdate, yaxis(2) ytitle("Inflation", axis(2)) graphregion(color(white)) legend(order(1 "Newspaper Coverage" 2 "Inflation") pos(11) col(1) ring(0) symxsize(5) size(small)) xtitle("")

graph export "`Dir'/Figures/TimeSeriesNewsHist.png" , as(png) replace

*1970s example

threshold NormalizedArticles if mdate>tm(1953m01) & mdate<tm(1970m02), threshvar(Inflation) regionvars(Inflation)
reg NormalizedArticles Inflation if mdate>tm(1953m01) & mdate<tm(1970m02)

twoway (scatter NormalizedArticles Inflation if mdate>tm(1953m01) & mdate<tm(1970m02)) (lfit NormalizedArticles Inflation if Inflation<3.9394 & mdate>tm(1953m01) & mdate<tm(1970m02)) (lfit NormalizedArticles Inflation if Inflation>=3.9394 & mdate>tm(1953m01) & mdate<tm(1970m02)), graphregion(color(white)) legend(order(1 "Historical Newspaper Coverage" 2 "Fit - Below Threshold" 3 "Fit - Above Threshold") pos(11) col(1) ring(0) symxsize(5) size(small)) ytitle("Inflation Newspaper Articles") xtitle("Inflation") xsc(r(-2,9)) xlabel(-2(1)9)

graph export "`Dir'/Figures/ScatterHistNewsUSA.png" , as(png) replace


* ===== Remaining Figures ======
*The remaining figures plot some of the regression results generated by the code above. These are not generated on the fly, so separate data files containing the underlying values are included.

*habituation plot (Figure 4)
clear
u "`Dir'/Habit_Dat.dta"

twoway (scatter Threshold_Google AvgInf if Consistent==1 | Intermediate==1) (lfit Threshold_Google AvgInf if Consistent==1 | Intermediate==1), graphregion(color(white)) ytitle("Estimated Threshold") xtitle("Average Inflation") legend(off)

graph export "`Dir'/Figures/Habit.png" , as(png) replace

*histogram of thresholds (Figure A.3)
clear
u "`Dir'/Hist_Dat.dta"

hist Threshold_Google, bin(10) freq graphregion(color(white)) xtitle("Estimated Threshold")
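
*Optional (an addition, not part of the original package): save the histogram like the
*other figures. The file name HistThresholds.png is a placeholder, not the authors'.
graph export "`Dir'/Figures/HistThresholds.png", as(png) replace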




